Report on the TREC-5 Confusion Track
Abstract
For TREC-5, retrieval from corrupted data was studied through retrieval of single target documents from a corpus that was corrupted by producing page images, corrupting the bit maps, and applying OCR techniques to the results. In general, methods that attempted a probabilistic estimation of the original clean text fared better than methods that simply accepted corrupted versions of the query text.
Related resources
OCR Correction and Query Expansion for Retrieval on OCR Data -- CLARIT TREC-5 Confusion Track Report
The TREC-6 Spoken Document Retrieval Track
The Text REtrieval Conference (TREC) workshops provide a forum for different groups to compare retrieval systems on common retrieval tasks. The 1997 TREC workshop will feature a Spoken Document Retrieval task for the first time. This paper motivates the task and describes the measures to be used to evaluate the effectiveness of the retrieval methodologies. ...
Revisiting N-Gram Based Models for Retrieval in Degraded Large Collections
Traditional retrieval models based on term matching are not effective on collections of degraded documents (for instance, the output of OCR or ASR systems). This paper presents an n-gram based distributed model for retrieval on large collections of degraded text. Evaluation was carried out with both the TREC Confusion Track and Legal Track collections, showing that the presented approach outperforms ...
The TREC 2006 Terabyte Track
TREC 2006 is the third year for the track. The track was introduced as part of TREC 2004, with a single adhoc retrieval task. For TREC 2005, the track was expanded with two optional tasks: a named page finding task and an efficiency task. These three tasks were continued in 2006, with 20 groups submitting runs to the adhoc retrieval task, 11 groups submitting runs to the named page finding task...
Publication date: 1996